/*
Purpose: 	Append data on municipal year of incorporation from the 1987 Census of Governments (COG) and the 1987-2019 Census Boundary and Annexation Survey (BAS)
Author: 	Kiara Wyndham-Douds
Dataset: 	U.S. Dates of Municipal Incorporation 
Date:		July 6, 2022
*/

/* This do file appends two datasets: 
1. cog_1987_munis_wplacecodes_wnhgiscode.dta - data from 1987 Census of Governments
	with year of incorporation for every municipality
2. bas_dec_incorp_1987_2019_nhgiscode.dta - data from 1987-2019 Census Boundary 
	and Annexation Survey (BAS) for new incorporations during this time period.
	
Note: The final appended data file does NOT cover municipalities that incorporated
and then disincorporated, merged with, or were annexed into another municipality
before 1987. It should therefore be read as a record of incorporation dates for
municipalities that existed as incorporated municipalities in 1987 or later.
*/

use cog_1987_munis_wplacecodes_wnhgiscode.dta, clear
append using bas_dec_incorp_1987_2019_nhgiscode.dta

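*check for duplicates, since both files cover 1987
*(distinct is a community-contributed command: ssc install distinct)

distinct nhgisplace // not all unique
*  16329 observations, 16311 distinct, i.e. 18 surplus records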
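
*A built-in cross-check that needs no extra installation (an addition, not part
*of the original workflow): -duplicates report- tabulates how many observations
*share each nhgisplace value, so the surplus records show up in the copies>1 rows.
duplicates report nhgisplace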

*identify duplicates: dup is 0 for unique places and 1, 2, ... within duplicated places
quietly bysort nhgisplace: gen dup = cond(_N==1,0,_n)
tab dup

list nhgisplace Name place_name_bas year_incorp dup if dup>0
*in every case, the COG year of incorporation is earlier than the BAS year,
*so keep the COG record and drop the BAS record
drop if dup>0 &  place_name_bas!=""
*18 dropped
drop dup
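
*Sanity check (an addition, not part of the original workflow): once the BAS
*duplicates are dropped, nhgisplace should uniquely identify observations.
*-isid- exits with an error if it does not. It also errors on missing values,
*so this assumes nhgisplace is never missing.
isid nhgisplace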

*identify data source
gen data_source=""
replace data_source="1987 COG" if Name!=""
replace data_source="1987-2019 BAS" if place_name_bas!=""
tab data_source,m

*create combined place name variable
drop place_name
gen place_name=Name
replace place_name=place_name_bas if place_name==""
codebook place_name

*keep needed variables and export to muni_yr_incorp.csv
keep nhgisplace gisjoin year_incorp data_source place_name
export delimited using "muni_yr_incorp", replace


*histogram of year of incorporation, 1800 onward, in one-year bins
hist year_incorp if  year_incorp>1799, freq width(1)  ///
	xlabel(1800(20)2010) xmtick(##5) start(1800) lcolor("110 137 194") fcolor("110 137 194") 
